Scope:
20,000 images, 1 million+ complex polygons, 100+ objects per image, 40+ classes
Project:
AI-based image recognition software that automates waste analysis to optimize the waste management process and drive recycling revenues for waste management companies.
Challenge:
A wide variety of classes and highly complex images: 80-200 polygons per image, 40+ waste categories
Solution:
The annotators and validators were required to pass rigorous training covering the basics of recycling and the various classes. A second layer of QA and desktop spot checks were applied to ensure the consistency and quality of the annotation across the dataset.
Annotating waste | Recycling

Problem overview

Recycling on a large scale is a challenging endeavor, especially as the production of plastic has surged. Only one-fourth of solid trash in the U.S. is recycled, and even less in developing countries. Manual sorting jobs are dangerous and poorly paid, and employee turnover is high. Logistical challenges, cost, and labor intensiveness, among other factors, keep the recycling industry from reaching its potential.

Using robotics and AI models to identify and sort various types of waste is at the top of waste management companies' agendas. Training AI models to identify different materials, whether they are bottles or cans and whether they are misshapen or have food particles on them, would remove the limitations of previous sorting techniques and accelerate the process.

Collecting the images used to train the AI is the most important step. Images have to be diverse in terms of waste types, light reflection, degree of overlap between items, etc. The main challenge is to make sure the collected data resembles the real-world conditions in which the product will be deployed.

Unlike some solutions that focus on collecting perfect images of scenarios that will not happen in real life (for example, perfectly shaped, clean waste items that are spread apart), our clients focused on capturing real-life scenarios. This required a much larger dataset of extremely complex imagery to achieve good AI accuracy, but it leads to a better commercial product.

Projects

At HAIVO we delivered two projects aimed at automating the waste management process and increasing its productivity.
Diwama is developing AI-based image recognition software that automates waste analysis to drive productivity for waste management companies.
Using a camera and a deep learning algorithm, the waste type and brand are detected and tracked across the waste value chain. The data on waste type and brand is then collected and displayed on an online dashboard, which helps waste managers improve their operations and reduce their costs, and helps packaging brands report on their sustainability efforts and Extended Producer Responsibility.

Diwama’s AI algorithm is implemented through 2 products, Vitron and Widen:
Vitron is for waste sorting facilities, where a camera is installed over a conveyor belt and the AI algorithm characterizes the waste in real time, enabling the waste manager to make data-driven decisions that increase recycling rates and the quality of recyclables.

Widen is an API integration of the AI algorithm into other applications such as mobile applications and smart bins. Diwama is currently working with a waste management company that provides a door-to-door waste collection service. Widen is being integrated into the company’s mobile application to help users sort better at home: they simply scan a confusing waste item, and the AI tells them whether it is recyclable and whether it is accepted by the collection company.

How are you going to use the labeled data to help you solve the problem?

Data labeling is at the heart of Diwama’s AI model: we teach the deep learning algorithm to detect multiple classes of waste by collecting images of waste and labeling each waste item with its corresponding type. Two types of annotation are used. Bounding boxes are used when an image contains only a few waste items that do not overlap. Semantic segmentation is used when many objects overlap, to avoid labeling two objects as one.
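
The rule described above (boxes for a few non-overlapping items, segmentation otherwise) can be sketched roughly as follows. This is only an illustration: the function names, the `max_items` cutoff and the `max_iou` threshold are hypothetical, and the actual criteria used on the project are not stated here.

```python
from itertools import combinations


def box_iou(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def choose_annotation_type(boxes, max_items=10, max_iou=0.0):
    """Pick an annotation type from rough item boxes for one image.

    max_items and max_iou are assumed values for illustration only.
    """
    # Many items in one image: segmentation keeps neighbours separate.
    if len(boxes) > max_items:
        return "segmentation"
    # Any overlapping pair: boxes could merge two objects into one.
    for a, b in combinations(boxes, 2):
        if box_iou(a, b) > max_iou:
            return "segmentation"
    return "bounding_box"


separate_items = [(0, 0, 10, 10), (20, 20, 30, 30)]
overlapping_items = [(0, 0, 10, 10), (5, 5, 15, 15)]
print(choose_annotation_type(separate_items))
print(choose_annotation_type(overlapping_items))
```

In practice the decision would likely be made per batch by a project lead rather than per image by a script, but a check like this can help flag images that were routed to the wrong annotation workflow.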

What is the main challenge in collecting and labeling the right dataset and how are you solving it?

Collecting the images used to train the AI is the most important step. Images have to be diverse in terms of waste types, light reflection, degree of overlap between items, etc. The main challenge is to make sure the collected data resembles the real-world conditions in which Diwama’s product will be deployed. Unlike other companies that focus on collecting perfect images of scenarios that will not happen in real life (in our case, perfectly shaped, clean waste items that are spread apart), Diwama focuses on capturing real-life scenarios. This requires a much larger dataset to reach good AI accuracy, but it leads to a better commercial product.

HAIVO’s labeling work

In order to train Diwama’s algorithm, an extensive dataset of waste images was collected. The labeling work required both high precision and a deep understanding of labeling classes. This project required the labeling team to build deep expertise in distinguishing particular materials, packages, objects, and even local brands.

This was a very challenging project because it required annotators to get used to scanning large piles of waste, across very similar-looking images, for specific objects of interest. To address this, HAIVO trained a dedicated team that retains the knowledge acquired from previous batches of work and performs better with every new batch. These workers have developed a good understanding of the subtleties of the annotation process, such as labeling truncated or cropped objects and handling materials that cannot be clearly identified.
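
As one illustration of how truncated or cropped objects might be handled consistently, the sketch below flags a polygon as truncated when any of its vertices lies on or near the image border. The function name, the `margin` parameter and the rule itself are assumptions made for this example; HAIVO's actual annotation guidelines are not described here.

```python
def is_truncated(polygon, width, height, margin=1):
    """Return True if any vertex lies within `margin` pixels of the image edge.

    polygon is a list of (x, y) pixel coordinates; width and height are the
    image dimensions. The margin of 1 pixel is an assumed default.
    """
    return any(
        x <= margin
        or y <= margin
        or x >= width - 1 - margin
        or y >= height - 1 - margin
        for x, y in polygon
    )


# A bottle cut off by the left edge of a 100x100 image, and one fully inside.
cropped_bottle = [(0, 5), (10, 5), (10, 20)]
whole_bottle = [(10, 10), (20, 10), (20, 20)]
print(is_truncated(cropped_bottle, 100, 100))
print(is_truncated(whole_bottle, 100, 100))
```

A flag like this could be stored alongside each label so that validators review cropped objects against the same rule during QA.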

https://humansintheloop.org/picvisa-case-study/